DOCUMENT RESUME

ED 093 906                                        TM 003 721

AUTHOR        Taylor, Bob L.
TITLE         Potential Uses of the National Assessment Model at
              the State and Local Levels.
PUB DATE      [Apr 74]
NOTE          31p.; Paper presented at the Annual Meeting of the
              American Educational Research Association (59th,
              Chicago, Illinois, April 1974)

EDRS PRICE    MF-$0.75 HC-$1.85 PLUS POSTAGE
DESCRIPTORS   Academic Achievement; *Citizenship; Curriculum
              Development; Educational Accountability; *Educational
              Assessment; Information Dissemination; *Models;
              National Surveys; Objectives; *School Districts;
              *State Programs; Testing Programs; Use Studies
IDENTIFIERS   *National Assessment



ABSTRACT

The model used by National Assessment for data gathering and
reporting on the Citizenship area is described, and the potential
uses of the model for state and local assessment, curriculum
development, and accountability purposes are discussed. The study
was carried out using papers and reports from the National
Assessment of Educational Progress, Denver office, and state reports
on adaptations of the model for state assessment needs. Adaptations
of the model for curriculum development were identified, and,
finally, adaptations of the model for accountability purposes were
suggested and discussed. (Author)



U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
NATIONAL INSTITUTE OF
EDUCATION
THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR
OPINIONS STATED DO NOT NECESSARILY REPRESENT OFFICIAL NATIONAL
INSTITUTE OF EDUCATION POSITION OR POLICY.



BEST COPY AVAILABLE 



POTENTIAL USES OF THE NATIONAL ASSESSMENT MODEL 
AT THE STATE AND LOCAL LEVELS



By 

Bob L. Taylor 
University of Colorado 
Boulder, Colorado 



AERA Annual Meeting
Chicago, Illinois
April, 1974




National Assessment is a plan for the systematic, census-like
survey of knowledges, skills, understandings, and attitudes. It
is an information gathering plan aimed at providing both educators
and the lay public with information concerning the level of
achievement in selected subject areas for students and young adults.
The goal is to provide information that can be used to improve
education. It is concerned with the achievement status of four age
levels in ten different subject areas. The subject areas selected
for assessment were: Art, Career and Occupational Development,
Citizenship, Literature, Mathematics, Music, Reading, Science,
Social Studies, and Writing.

THE ASSESSMENT MODEL

The assessment model is in the continuous process of being refined
and improved; thus only the basic components of the model are
presented in Diagram I. A circular scheme is used as the best way
of presenting it, since in actual use its application may be initiated
with any one of the components. Also, in its application, there are
continual interactions between and among the various components.
While theoretically the process starts with the refinement of overall
national goals into specific subject matter, behavioral objectives,
and progresses in logical sequence through to the final Utilization
of the Information, in practice there is much greater freedom with
respect to the utilization of the components.

The model for the Citizenship Assessment is presented here in
outline form with a fairly detailed description of its components.
As presented in Diagram I, there are seven basic components
identified in the model: Objectives Development, Exercises Development,
Sampling Plan, Administration of Exercises, Scoring and Analysis,
Reporting and Dissemination, and Utilization of Information. While
many of the fine points of the model are not developed in the
following outline, it is described in sufficient detail to give the
reader a good understanding of how the data were collected and what
implications might result from these data. The number of subtopics
in the model and their distribution indicate that the major efforts
of National Assessment have been with the first five components.
Components six and seven have been areas of controversy and, therefore,
have received less attention until recently.



Outline of the Assessment Model for Citizenship
I. Objectives Development¹,²,³,⁴,⁵



¹Womer, Frank B., What Is National Assessment? National Assessment of
Educational Progress, Denver, Colo., 1970.

²Norris, Eleanor L. (Ed.), Citizenship Objectives, Committee on
Assessing the Progress of Education, Ann Arbor, Mich., 1969.

³Campbell, Vincent N., et al., Citizenship Objectives for 1974-75
Assessment, Education Commission of the States, National
Assessment of Educational Progress, Denver, Colo., 1972.

⁴Campbell, Vincent N. and Daryl G. Nichols, "National Assessment
of Citizenship Education," Social Education 32:279-81, June,
1969.

⁵Campbell, Vincent N., et al., Report 2, Citizenship: National
Results, Education Commission of the States, National
Assessment of Educational Progress, Denver, Colo., November, 1970.



A. The task of developing objectives in the field of 
citizenship was awarded to the American Institute for 
Research of Palo Alto, California. These criteria 
were used in examining the objectives: 

(1) They were considered important by scholars. 

(2) They were accepted as an educational task by the 
school. 

(3) They were considered desirable by thoughtful lay
citizens.

Scholars reviewed the objectives for authenticity 
with respect to their subject fields; school people 
reviewed the objectives in terms of their actual 
emphasis in their schools; and laymen reviewed them 
based on their own experiences with regard to their 
value in real life. 

B. The American Institute for Research staff reviewed 
previous lists of citizenship objectives and boiled 
these down to one comprehensive list of 20 objectives. 

C. Outstanding local teachers familiar with each target-age
group (9, 13, 17, adult), working with the American
Institute for Research staff, broke down each general
objective into the most germane behaviors deemed
appropriate as goals for a given age group.

D. A selected group of students and adults in each age
group was asked by the American Institute for Research
staff to recall and describe outstanding citizens of
their acquaintance and specific incidents reflecting
good and poor citizenship. These incidents and
descriptions, about 1,000, were used to check the completeness
of the initial list of objectives.

E. The objectives were stated on three levels (general
objectives, sub-objectives, and behavioral age
illustrations or statements). The results were summarized for
each age group.

F. The revised list of objectives, broken down into
important behaviors, was then deliberated on for three
days by a panel of national leaders in citizenship
education and related social sciences.

G. A group of persons in various roles from selected
California communities reviewed the objectives and made
suggestions. These included public and private school
administrators, counselors, teachers, a judge, a county
planner, labor and business leaders, and social
scientists.






H. The objectives were then reviewed by panels of laymen.
Eleven lay review panels representing four geographic
areas of the country and three different community sizes
were used. Each panel spent two days reviewing the
objectives based on these two questions: "Is this something
important for people to learn today?" and "Is
this something I would like to have my children learn?"

II. Exercises Development⁶,⁷,⁸

A. The production of the exercises was initiated by the
American Institute for Research in 1966. The exercises
were developed to cover all of the major objectives and
to represent the selected content areas. Many exercises
required the use of interview techniques, as well as the
usual pencil and paper exercises. Also, self report and
group task exercises were used.

B. Because National Assessment intends to describe what
people in an age group know, the exercises were written
to reflect three difficulty levels: to report knowledge
or skills common to almost all persons in an age group,
to report skills or understandings of a typical member
of an age group, and to report understandings or knowledge
developed by the most able persons in an age group.

C. All exercises were developed to meet these criteria:
content validity, clarity, functional exercise format,
clustering exercises based on a single set of stimulus
materials, directionality of response, difficulty level,
content sampling, and overlap between age groups. The
exercises were direct measures of some pieces of
knowledge, understandings, attitudes, or skills which were
mentioned in one or more of the objectives.

D. The exercises were reviewed by panels of lay persons for
clarity, meaningfulness, and invasion of privacy.

E. There was a tryout of the exercises involving
representatives of groups in the actual assessment: regions,
communities, races, sexes, and age groups. Following
the tryouts, the American Institute for Research staff
and subject matter specialists reviewed the tryout data
and made needed revisions.



⁶Womer, op. cit.

⁷Gadway, Charles J. (Ed.), Reading and Literature: General Information
Yearbook, Education Commission of the States, Report 02-GIY,
National Assessment of Educational Progress, Denver, Colo.,
May, 1972.

⁸Finley, Carmen J. and Frances S. Berdie, The National Assessment
Approach to Exercise Development, National Assessment of
Educational Progress, Ann Arbor, Mich., 1970.



F. A committee of subject matter specialists, measurement 
specialists, and National Assessment staff members rated 
the exercises to be included in the packages according 
to a set of criteria, and based on the ratings the 
exercises were selected for use. 

G. The selected exercises were reviewed by U. S. Office of
Education personnel for any infringement of privacy on
the part of the respondents or possible offensiveness.

H. Since there were about 160 minutes of testing time
available for each age group in each subject area, the
exercises used were only a small sample of the potential
number of exercises. The exercises were assembled into
administrative units (packages) for groups up to 12
persons.

III. Sampling Plan⁹,¹⁰

A. The sampling plan was subcontracted to Research Triangle
Institute, Raleigh, North Carolina. A multi-stage
design was used which was stratified by region, size of
community, and socio-economic status. This was a
probability sample which allowed researchers to collect data
from a small sample of the population and to infer from
that sample certain characteristics of the entire
population.

B. The populations for assessment were all 9 year olds, all 
13 year olds, all 17 year olds, and all young adults 26 
through 35 years old in the 50 States plus the District 
of Columbia. The only exceptions were the exclusions of 
institutionalized individuals of these ages--those in 
hospitals, prisons, and others who could not be reached. 

C. For ages 9 and 13, a school sample only was used and for
the 26 through 35 age group a household sample only was
used. For the 17 year olds, both a school and a household
sample were used.

D. The entire country was divided into population areas as
follows: cities, counties exclusive of cities, and
pseudo-counties (two or more counties were put together
when the population of a single county was less than
16,000). Each population unit of 16,000 residents was
assigned a number.



⁹Norris, Eleanor L., et al., Report 1, 1969-1970 Science: National
Results and Illustrations of Group Comparisons. J. R. Chromy
and D. Horvitz, "Structure of Sampling and Weighting,"
Appendix C, Education Commission of the States, National
Assessment of Educational Progress, Denver, Colo., July, 1970.

¹⁰Norris, Citizenship Objectives, op. cit.






E. The country also was divided into four geographic
regions: Northeast, Southeast, Central, and West.

F. Each geographic region was divided into communities of
four types: large cities of above 180,000 population,
urban fringe, middle-sized cities between 25,000 and
180,000 population, and small town-rural of under 25,000
population.

G. The 52 sampling units for each geographic area were
spread across the four community types in a fashion
proportional to their population in relation to the
area population.

H. To insure comparable representation from each part of
the country, an equal number of sampling units was
selected from each geographic region: 52 from each of
the four regions for a total of 208.

I. Sampling units were selected at random. This plan did
not guarantee that all 50 States would be included in
the sample. This was not a survey objective, but later
the design was changed so each state was included in
the sample.

J. In each sampling unit selected, all school buildings
enrolling students of the sample ages (public, private,
and parochial) were identified.

K. The plan for schools was to select samples of
approximately 250 to 350 pupils for each age group and from at
least two different buildings within each sampling unit
for each age group.

L. Each cooperating building principal provided a list of
names of students in the building from the specific age
groups. This list was used for the final random selection
of students to take the assessment exercises from that
building.

M. Information about the areas was obtained from the U. S.
census data. In order to insure reliable information
for lower socio-economic status groups, these groups
were oversampled. There was a disproportionate number
of schools from lower socio-economic status areas
included. In the overall results, the data from the lower
socio-economic areas were given the percentage value in
which they occurred in the total population.

N. From each of the 208 geographical samples, 100 adults,
ages 26 through 35, were randomly selected using the
following procedures. Each of the 208 geographic samples
was divided into equal secondary sampling units. Then
ten secondary sampling units were randomly selected from
the total 208 samples. Interviewers then personally
contacted the subjects in the chosen secondary sampling
units of the 26 through 35 age group and out-of-school
17 year olds. These persons were asked to participate
in the assessment.






O. Individuals were classified as black, white, and other
on the basis of information provided by the school or
by observation. Results were given for black and white
only. The number of individuals classified as other
was too small to produce reliable results.
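
Items G, H, and M above describe two calculations: spreading 52
sampling units across community types in proportion to population,
and giving oversampled groups their true population share when
results are combined. A minimal Python sketch of both follows. It is
an illustration only, not the Research Triangle Institute procedure:
the population counts are invented, the largest-remainder rounding
rule is an assumption, and the function names are hypothetical.

```python
from math import floor

REGIONS = ["Northeast", "Southeast", "Central", "West"]
UNITS_PER_REGION = 52  # from the outline: 52 units in each of four regions


def allocate_units(populations, total=UNITS_PER_REGION):
    """Split `total` sampling units across community types in proportion
    to population. Rounding uses the largest-remainder rule (an assumption;
    the outline does not say how fractions were resolved)."""
    grand_total = sum(populations.values())
    exact = {k: total * v / grand_total for k, v in populations.items()}
    alloc = {k: floor(x) for k, x in exact.items()}
    leftover = total - sum(alloc.values())
    # Hand the remaining units to the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc


def weighted_percentage(strata):
    """Combine stratum results so each stratum counts at its population
    share, however heavily it was sampled.

    `strata` maps a name to (population_count, percent_correct_in_sample).
    """
    grand_total = sum(pop for pop, _ in strata.values())
    return sum(pop * pct for pop, pct in strata.values()) / grand_total


# Hypothetical populations (in thousands) for one region; not from the report.
example_populations = {
    "large city": 3000,
    "urban fringe": 2500,
    "middle-sized city": 2000,
    "small town-rural": 2500,
}
allocation = allocate_units(example_populations)
total_units = UNITS_PER_REGION * len(REGIONS)

# Hypothetical results: the lower-SES stratum is 20 per cent of the population.
overall = weighted_percentage({"lower SES": (2000, 50.0), "other": (8000, 75.0)})
```

With these invented counts the 52 units fall 16, 13, 10, and 13
across the four community types, and the combined percentage is 70
even if half of the sampled schools came from the lower
socio-economic stratum.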

IV. Administration of Exercises¹¹,¹²



A. The administration of the exercises was subcontracted
to Research Triangle Institute in the East and to
Measurement Research Center of Westinghouse Learning
Corporation, Iowa City, Iowa, in the West. Cooperation
of schools was obtained by first contacting officials
at the state and then at school district levels. There
was above 90 per cent cooperation by schools.

Adults and out-of-school 17 year olds were contacted
by a personal door-to-door household canvass. Each
out-of-school participant was contacted individually.
The right of each to refuse to participate was respected.

B. A full-time trained staff of 2? district supervisors 
managed the field work. They were assigned to different 
geographical areas of the United States. They contacted 
schools and recruited and trained local teachers to 
help in the administration of the exercises in schools 
and recruited and trained other available persons for 
the out-of-school administration. 

C. In the schools, students from a single age group from
different classes were brought together in a room for
exercise administration. Group size was at least 8
and usually 12 students.

D. The exercises were organized in packages which contained
exercises from two or three different subject areas at
a single age level. No one person took all the exercises
in his age group. Age groups were assessed at different
times of the year.

E. In packages administered to groups, taped directions and
taped readings of the exercises were used in addition
to printed packages. This was done to establish
consistency in timing and administration and to provide
for nonreaders.

F. Several packages at ages 9, 13, and 17 consisted of
exercises that were given by exercise administrators to
one individual at a time. The administration of all the
packages for the adult assessment was done by interviews.



¹¹Womer, op. cit.

¹²Gadway, op. cit.



G. Each package required about 50 minutes of administrative
time. Each person took only one package with the
exception of the out-of-school 17 year olds, who were
asked to take four or five packages each since they
were the most difficult and expensive group to locate.

H. Students' names were confidential and did not appear
on any packages. The name roster was kept at the building
level and used only in the organization of the
in-school sampling.

V. Scoring and Analysis¹³,¹⁴



A. The scoring and analysis of the exercises were
subcontracted to Measurement Research Center of Westinghouse
Learning Corporation, Iowa City, Iowa.

B. The multiple-choice exercises were scored and recorded
routinely by machine.

C. The open-ended exercises were scored by trained scorers
using a key of acceptable and unacceptable achievements
in terms of the objectives.

D. Results were reported for each objective. Also, the
results were reported both as the percentage of any group
of respondents making the desired responses to an exercise
and as the difference between the percentage of a group
making the desired responses and the corresponding national
percentage.

E. In the assessment, there was a lack of proportionality
among characteristics used in the comparison of groups,
such as color, sex, parental education. A statistical
procedure, balancing, was used to correct for this
problem in the comparative analysis of the data. Balancing
is a procedure to examine the performance of groups
classified on one characteristic adjusting for the fact
that these groups differ on a specified set of other
characteristics.
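
Item D reports each result twice: as the percentage of a group
giving the desired response, and as that percentage minus the
national percentage. A minimal Python sketch of this arithmetic
follows. The records and field names are invented for illustration,
sampling weights are ignored, and the balancing adjustment in item E
is not attempted here.

```python
def percent_desired(responses):
    """Percentage of respondents giving the desired response.
    `responses` is a list of booleans (True = desired response)."""
    if not responses:
        raise ValueError("no responses")
    return 100.0 * sum(responses) / len(responses)


def group_effects(records, group_key):
    """Return {group: (group_percent, difference_from_national)}.

    `records` is a list of dicts with a boolean "desired" field and
    a grouping field such as "region" (hypothetical field names).
    """
    national = percent_desired([r["desired"] for r in records])
    groups = {}
    for r in records:
        groups.setdefault(r[group_key], []).append(r["desired"])
    return {
        g: (percent_desired(vals), percent_desired(vals) - national)
        for g, vals in groups.items()
    }


# Invented responses to one exercise: 2 of 5 desired in one region,
# 4 of 5 in another, so 6 of 10 nationally.
records = (
    [{"region": "Southeast", "desired": d} for d in (True, True, False, False, False)]
    + [{"region": "West", "desired": d} for d in (True, True, True, True, False)]
)
effects = group_effects(records, "region")
```

Here the invented national percentage is 60, so the two regions are
reported as 40 (a difference of -20) and 80 (a difference of +20).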

VI. Reporting and Dissemination¹⁵,¹⁶,¹⁷

¹³Womer, op. cit.

¹⁴Gadway, op. cit.

¹⁵Womer, op. cit.

¹⁶Campbell, Report 2, Citizenship: National Results, op. cit.

¹⁷Gadway, op. cit.



A. The reporting of results was directed to subject matter
specialists, professional educators, and informed laymen.
Multiple reports were developed to serve these different
audiences.

B. Approximately ^0 per cent of the exercises were made public
at the end of each assessment year. Not all exercises
were so reported since they were to be used over again in
future assessments in order to measure change by means
of comparing the results on the uncontaminated exercises.

C. The exercises released for publication were selected to
be representative of all exercises administered as well
as the results received on the assessment.

D. Reporting was done by 9, 13, 17, and 26 through 35 age
groups. Since the same exercises were used with
different age groups, there were comparable data across two
or more age levels.

E. Reporting was also done by groups within the categories
of regions, community types, sex, socio-economic status,
and white, black, and other.

F. Final reports were printed with a short description of
the exercises, the national percentage of success, and
group differences from the national percentage of
success for each exercise. This was done without any
interpretation of results.

G. Both observed and balanced results for all exercises
and by groups were reported. The effects of balancing
on measured characteristics such as sex and region were
included in the report.

H. There were no scores reported for individuals. No
single individual took more than one twelfth of the
exercises, and no individual took a package that sampled
only one subject area.

I. Results were reported through the media: written word,
radio, television, films, and personal reports.

VII. Utilization of Information¹⁸,¹⁹

¹⁸Womer, op. cit.

¹⁹Conway, Larry S., "Some Implications of the National Assessment
Model and Data for State and Local Education," Paper Presented
at the 1973 Annual Meeting of the AERA, New Orleans,
Louisiana, February 26, 1973.



A. The results provided potential information for educational
decision making. For example, considering the somewhat
lower performance of the Southeast Region on the
Citizenship results, school boards in that region might decide
to conduct their local investigation to determine the
status of their school programs on citizenship skills,
understandings, and attitudes.²⁰

B. The results raised many questions which may lead to
other investigations. For example, in making
comparisons of all Citizenship results combined, it was found
that the Extreme Affluent Suburbs results were above the
national median at all ages and that the Extreme Rural
and Extreme Inner City results were below the national
median at all ages. Here are discrepancies in performance
which need to have causal studies conducted on them from
the perspectives of different disciplines such as political
science, sociology, economics, and education.²¹

C. The results of several cycles should provide evidence
of changes in knowledge, skills, understandings, and
attitudes in the age groups as they relate to educational
objectives.

D. School administrators can make comparisons between
groups, and may improve student performance from the
information gained in this manner.

From this review of the model, it is evident that the National
Assessment staff has put a great deal of effort and know-how into
the design, plus the development of each of the components.



²⁰Campbell, Vincent N., Manford J. Ferris, and Daryl G. Nichols,
National Assessment Report 6, 1969-1970 Citizenship: Group
Results for Sex, Region, and Size of Community, Education
Commission of the States, National Assessment of Educational
Progress, Denver, Colo., July, 1971.

²¹Norris, Eleanor L., Vincent N. Campbell, Manford J. Ferris,
and Carmen J. Finley, National Assessment Report 9, 1969-1970
Citizenship: Group Results for Parental Education,
Color, Size and Type of Community, Education Commission
of the States, National Assessment of Educational Progress,
Denver, Colo., May, 1972.



USE OF THE MODEL AT STATE AND LOCAL LEVELS



In planning for the collection of National Assessment data,
the model, presented in the last section, was developed. A number
of states have found adaptations of the model useful in conducting
state assessments in which desirable learning outcomes are
identified and the status of learners with respect to these outcomes is
determined.

State assessment is a rapidly developing movement. At this
writing, all of the states have assessment activities either in
operation, in a developmental process, or in a planning stage.²²
While the statewide assessment programs have many similarities,
they break down into two basic types of programs on the question,
"Who gets to use the results?" The divisions are those states for
which data are collected for decision making by state agencies and
those states for which data are collected for decision making by
teachers and administrators.

In about a third of the states, the programs were mandated by
the state legislatures, and the results of the assessments are to
be reported back to the state legislatures. In a few of the states,
the data are to be used for PPBS (Planning, Programming, and
Budgeting Systems). In about half of the states where the assessment
data are being used to make state-level decisions, state and federal
funds will be allocated based on the results. Participation in
assessment is required by law in about a fifth of the states. In
the states where the assessment data are being used to make
state-level decisions, samples rather than all students are being
assessed, while in the local-level, decision-making states all
students in the target populations are being assessed.
Criterion-referenced instruments are very common with the states where the
data are being used for state-level decisions, but the states
collecting information for local decision making are favoring
norm-referenced instruments. Finally, no dominant funding pattern has
evolved in either of the two groups of states.

²²State Educational Assessment Programs, 1973 Revision. Joan S.
Beers and Paul B. Campbell, "Statewide Educational
Assessment," Educational Testing Service, Princeton, New Jersey,
1973, p. 1.

State Adaptations of the Model 

In its assessment of Citizenship education, Maine made an
extensive application of the National Assessment model and carefully
duplicated it so that comparable data were collected at the state
level.²⁴ Maine's first cycle of the ten subject matter areas of
National Assessment (Art, Career and Occupational Development,
Citizenship, Literature, Mathematics, Music, Reading, Science,
Social Studies, and Writing) is to be completed by scheduling two
of these areas each year for five years. Citizenship and Writing
were the first subject areas to be assessed.

Based on the results of a previous study of objectives for
education in Maine, two review committees decided to accept the
National Assessment objectives as being closely related to the
Maine objectives. Maine selected the 17-year-old population of
in-school students for its first assessment. A sample of 2,000
17-year-old students was used to represent the approximately
17,000 17-year-old students in the State. The State was divided
into four geographical regions. As in National Assessment, school
buildings were randomly selected from geographic regions, and
students were then randomly selected from the buildings. Packages
were developed with exercises taken from the two subject areas. The
available, released exercises from National Assessment were carefully
examined to see if they reflected objectives valid for Maine and to
see if some could be modified, where needed, to be administered in
group sessions using the paced-tape method while still retaining a
high degree of comparability to the National Assessment individually
administered exercises. The packages were made up of 23 Citizenship
and seven Writing exercises, plus a 23-item Student Questionnaire.
The exercise format was kept virtually identical to the one used in
National Assessment. Trained administrators were sent out to
administer the exercises, and the exercises were scored according to
National Assessment procedures. On data reporting and analysis, there
was the census-like reporting of the performance of the Maine students
plus comparisons of the Maine results with appropriate National
Assessment data.

²³Ibid., pp. 2-3.

²⁴Maine Assessment of Educational Progress: Methodology (Report 5),
Department of Educational and Cultural Services, Augusta,
Maine, 1972.

In summary, the Maine Assessment duplicated the National
Assessment procedure as completely as possible. With minor exceptions,
the same objectives were used for Citizenship. The same sampling
design was used with adaptations to a smaller geographical area and
population. The exercises were for the most part taken from those
released by National Assessment, and they were organized into packages
similar to those used by National Assessment. The administration and
scoring of the exercises were conducted in the same manner as National
Assessment had used. Since the same private contractors were used by
Maine as were used by National Assessment, the duplication was
complete wherever possible. The reporting and data analysis were
similar, and the data did provide the opportunity to compare the
results in Maine with the results from National Assessment.

Here, the model was very carefully duplicated at the state
level. The big question which comes to mind after studying the
Maine Citizenship report is, "Aren't the National Assessment data
being treated here as some kind of a national norm against which
the performances of 17-year-old students in Maine were being
compared?" Of course, this use of National Assessment data had been
questioned from the start of the proposal for an assessment at the
national level. Now, Maine has provided the opportunity to study
the effects of this use of the data on the educational system of
a state.

Another state which carefully followed the model was Connecticut.²⁵
Here, an assessment was first conducted in Reading. To permit
comparisons, the Connecticut program used available instruments and applicable
procedures developed by National Assessment which were adapted to the
requirements of the local situation. Connecticut's Reading objectives
were matched to the Reading objectives of National Assessment.
Approximately 220 reading exercises from National Assessment were
used in producing the packages used in the Connecticut assessment.
Exercises were selected to represent all of Connecticut's reading
objectives. The age groups assessed were 9, 13, and 17. As with the
National Assessment packages, tape-recorded instructions were used.
The sampling design was a multi-staged design duplicating with few
exceptions the National Assessment design. As with National Assessment,
a group of administrators for the packages was recruited and
trained.

²⁵Report on the Assessment of Reading Skills of Connecticut Public
School Students, Institute for the Study of Inquiring Systems,
Philadelphia, Pa., and Department of Education, Hartford,
Conn., 1972.

This was another example of careful duplication of the National
Assessment model down to using the same objectives and exercises.
Again, there was the use of the National Assessment results as norms
to which the Connecticut results were compared.

The Texas Needs Assessment used the model for the development
of their assessment in Mathematics at the sixth-grade level.²⁶
However, while using ideas from the model, they broke with it in a
number of places. The Texas people were concerned that the
assessment would yield information which would be useful to teachers in
their classroom instruction of students. From a pilot study, it
was decided to use a criterion-referenced reading test and to work
with grade levels instead of age groups of students. They worked
with the sixth grade, and the tests were administered by the staff
of each school which participated in the assessment. The objectives
were chosen from the major skill areas treated in the state-adopted
textbooks. Regional location and community size were taken into
consideration in selecting the sample. Approximately 10 per cent
of the Texas schools teaching at the sixth-grade level administered
tests, and approximately 10 per cent of the pupils being taught
at the sixth-grade level were included in the sample. Reports were
given to teachers on the performance of their individual students.
Also, there was a school report on the performance of the students
for each school and a report on each of the classes in the school.
Comparisons were made on the basis of sex, race, and size of community.

²⁶Sixth-Grade Mathematics: A Needs Assessment Report, Texas Education
Agency, Austin, Texas, 1972.


The Colorado Needs Assessment, while using the model, made an
even greater break with it.²⁷ Its objectives were based on a state
study of educational goals, and the educational goals were restated
in terms of performance objectives. Following the model,
objective-referenced exercises were written. A sampling plan was used and
the student responses were analyzed and reported. In this
assessment, classroom teachers were involved in the writing and refinement
of the behavioral objectives. Objective-referenced exercises were
written for nine subject areas and the exercises were administered
to a sample of 30,000 Colorado students. A stratified random
sampling procedure was used to select a sample of school districts of
the State. Then schools were selected at random from the districts
chosen. Finally, classes in school buildings were randomly chosen
for testing. The samples were representative of all Colorado students
in grades 3, 6, 9, and 12. A group of proctors was hired and trained
to administer the exercises, and the exercises were scored by computer.
The data were analyzed on a statewide and district basis, and the
results were broken down by subgroups, e.g., boys, girls, urban, rural.
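
The three-stage selection described for Colorado (a stratified random draw of districts, then schools drawn at random within the chosen districts, then classes drawn at random within the chosen schools) can be sketched as follows. This is only an illustration under stated assumptions: the report does not give the strata, the sample sizes at each stage, or the selection software, so the function name, its parameters, and the small sampling frame are all hypothetical.

```python
import random


def three_stage_sample(districts, per_stratum, per_district, per_school, seed=0):
    """Draw classes in three stages: districts, then schools, then classes.

    `districts` is a list of dicts with keys "name", "stratum", and
    "schools" (a dict mapping school name -> list of class labels).
    All sizes are hypothetical; the source does not report them.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be repeated

    # Stage 1: stratified random sample of districts.
    strata = {}
    for district in districts:
        strata.setdefault(district["stratum"], []).append(district)
    chosen = []
    for stratum in sorted(strata):
        pool = strata[stratum]
        chosen.extend(rng.sample(pool, min(per_stratum, len(pool))))

    # Stage 2: random schools within each chosen district.
    # Stage 3: random classes within each chosen school.
    selected = []
    for district in chosen:
        schools = sorted(district["schools"])
        for school in rng.sample(schools, min(per_district, len(schools))):
            classes = district["schools"][school]
            for label in rng.sample(classes, min(per_school, len(classes))):
                selected.append((district["name"], school, label))
    return selected


# Hypothetical sampling frame: two strata, two districts in each.
frame = [
    {"name": "District A", "stratum": "urban",
     "schools": {"A1": ["3a", "3b", "6a"], "A2": ["9a", "12a"]}},
    {"name": "District B", "stratum": "urban",
     "schools": {"B1": ["3a", "6a"], "B2": ["9a", "9b", "12a"]}},
    {"name": "District C", "stratum": "rural",
     "schools": {"C1": ["3a", "6a", "9a"]}},
    {"name": "District D", "stratum": "rural",
     "schools": {"D1": ["6a", "12a"], "D2": ["3a", "9a"]}},
]

picked = three_stage_sample(frame, per_stratum=1, per_district=1, per_school=2)
for row in picked:
    print(row)
```

Because districts are drawn separately within each stratum, every stratum is certain to appear in the sample, which is what allows the results to be broken down by subgroups such as urban and rural.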

As pointed out earlier, states are rapidly moving into the
assessment field. Some are reproducing the National Assessment model
at the state level, and others are developing variations of the model.
The more crude efforts have resulted in endless pages of raw
percentages without any explanation of the results. Based on a survey
of state assessment programs, Beers and Campbell identified several
of the problems which are common to these state programs.²⁸ Naturally,
a shortage of money and staff was the most frequently mentioned
problem, for it is a fact that many states have moved into this
area without providing adequate funds for a realistic assessment
program. Also, teacher resistance to assessment and negative public
attitude toward outside testing were problems mentioned. Test
results have been misused in the past, as in the firing of teachers
based on incorrect interpretation of test results. Also, test
developers have been guilty of violating the privacy of students through
questions which transgressed the examinee's human and legal rights.
A third problem area has been with the utilization and dissemination
of results. Some school officials do not understand the results. In
some situations, there has been hostility to the results. Some
officials have ignored results in making decisions. Finally, results
have frequently not reached the right people in a useable form.

²⁷Helper, John W., An Assessment of Learner Needs in Colorado,
Colorado Department of Education, Denver, Colo., 1972.

²⁸State Educational Assessment Programs. Beers and Campbell, op. cit.,
p. 3.

Use of the Model at the District Level 

To date, there has been a limited number of efforts reported
on the use of the National Assessment model at the district level.
Three such assessments on which some data have been released are
being conducted in Lincoln, Nebraska; San Bernardino, California;
and Montgomery County, Maryland.

In Lincoln, Nebraska, the exercises released by National
Assessment in Citizenship and Writing were used in a local assessment which
yielded data comparable to National Assessment data.²⁹,³⁰ A group
of supervisory personnel from the central office identified the
Citizenship objectives which were applicable to the Lincoln schools.
Then the released National Assessment exercises were selected which
were applicable to these Lincoln objectives. Also, the National
Assessment model was followed in selecting a random sample of 13
year olds from the Lincoln junior high schools. In addition, a
sample of in-school 17 year olds was tested on some of the writing
exercises. The administration of the exercises was carried out by
a group of specially trained district administrators, and the
tape-paced method was used in presenting the exercises to the students.
Scoring followed the National Assessment procedures, and in reporting
the results comparisons were made to National Assessment data with
special attention given to comparable subgroups such as cities of
similar size and the same geographic region.

²⁹"Weekly Focus," Lincoln Public Schools, Lincoln, Nebraska,
February 12-19, 1973, p. 3.

³⁰[Surname illegible], Ronald, Associate Superintendent for Instruction,
Report on Assessment Results to Board of Education, Lincoln Public
Schools, Lincoln, Nebraska, Spring, 1973.

The San Bernardino City Schools developed a criterion-referenced
assessment model of student progress which was based on the National
Assessment model.³¹,³² This model involved local teachers, students,
and laymen; eight educational goals were identified through the efforts
of workshops involving teachers, students, and patrons. A Curriculum
Task Force composed of 20 teachers wrote behavioral objectives for the
goals to be appropriate for grades 3, 5, [illegible], and 12. National
Assessment consultants assisted the teachers in developing exercises to
assess the stated objectives at these grade levels. Also, the National
Assessment consultants helped to design a sampling procedure to provide
district-wide representation. The exercises were organized into test
batteries for each grade level. The Teacher Task Force administered
and scored the tests. This is a break from the National Assessment
practice of using specially trained exercise administrators. The
results were tabulated in terms of percentage of students meeting
stated behavioral objectives. While the National Assessment model
was followed in many ways, such as use of behavioral objectives,
criterion-referenced assessment instruments, and sampling of target
populations, the assessment was designed for application at the
local level, and it was planned, developed, and carried out by local
personnel.

³¹Bonney, Lewis A., "Application of the National Assessment of
Educational Progress Philosophy in San Bernardino City Unified
School District," Unpublished Paper, San Bernardino City
Unified School District, San Bernardino, California.

³²Special Curriculum Task Force, "Report on Student Performance,"
Office of Instructional Services and Research and Development
Office, San Bernardino City Unified School District, San
Bernardino, California, June, 1972.

The Montgomery County Schools, Maryland, developed a program
for assessing 13- and 17-year-old students.³³ In this assessment,
the released National Assessment exercises for Writing were used.
These were administered in two group-package sessions to samples
of 13- and 17-year-old students. The results for Montgomery County
students were compared to the results from the nationwide samplings
of 13 and 17 year olds by National Assessment. One of the variations
in the Montgomery County sampling design was stratification by I.Q.
and grade level. The purpose was to spread the sample across the
grade by school-I.Q. groups; however, these sampling groups were not
used as reporting categories. Each age group (13 and 17) was
stratified by I.Q. groups (low and nonlow) and by grade levels.
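
The idea of stratifying only to spread the sample, while reporting by age group alone, can be illustrated with a short sketch. It assumes a simple proportional draw within each age by I.Q. group by grade cell; the actual Montgomery County allocation rules are not given here, so the function, the sampling fraction, the grade levels, and the roster below are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(students, fraction, seed=0):
    """Sample the same fraction from every (age, iq_group, grade) cell.

    The cells are used only to spread the sample; they are not meant
    to serve as reporting categories. Proportional allocation is an
    assumption, not a documented feature of the county design.
    """
    rng = random.Random(seed)
    cells = defaultdict(list)
    for student in students:
        key = (student["age"], student["iq_group"], student["grade"])
        cells[key].append(student)

    sample = []
    for key in sorted(cells):
        pool = cells[key]
        take = max(1, round(fraction * len(pool)))  # at least one per cell
        sample.extend(rng.sample(pool, take))
    return sample


# Hypothetical roster: 10 students in each of 8 cells.
roster = [
    {"id": f"{age}-{iq}-{grade}-{i}", "age": age, "iq_group": iq, "grade": grade}
    for age, grades in ((13, (7, 8)), (17, (11, 12)))
    for iq in ("low", "nonlow")
    for grade in grades
    for i in range(10)
]

sample = stratified_sample(roster, fraction=0.2)

# Reporting is by age group only, as in the comparison with national results.
by_age = Counter(student["age"] for student in sample)
print(dict(by_age))
```

Every cell contributes to the sample, so neither low-I.Q. students nor any grade level can be missed by chance, yet the tabulation at the end collapses the cells and reports only the two age groups.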

Adaptation of the Model 

From the above discussion, it is evident that there will be as
many adaptations of the model as there are local and state units
conducting assessments. Probably, there is no specific assessment
model which is the best; hence, there is no model that should be
applied without modification in any and all situations.
Nevertheless, there are principles of good assessment which should be
applied in developing or adapting a model for local assessment
purposes. Listed below are some characteristics which should be
found in a good assessment program.³⁴

³³Bayless, David L., Ralph E. Folsom, and Louise K. Lewis, "Sample
Design for Assessing Montgomery County Public Schools 13-
and 17-Year-Old Pupils Using the NAEP Model," National
Assessment of Educational Progress and Education Commission
of the States, Denver, Colorado, January, 1973.

1. The program has clearly defined goals that apply to a
particular audience or audiences.

2. The program has a realistic number of goals which are
attainable under the existing assessing conditions.

3. The program has established priorities among its goals
and places its major efforts on its major goals.

4. The program has been designed to gather information
considered to be important in education.

5. The program has specific objectives which it is striving
to attain.

6. The program has been designed to provide results at a
useable level of accuracy.

7. The program has used data-gathering instruments which
measure the objectives of the assessment.

8. The program has collected data in such a manner as to
introduce a minimum of error in the results.

9. The program has scored and processed data in an accurate
manner.

10. The program has used analytic techniques that provide
the data breakdowns needed by decision makers.

11. The program has reported results in a manner useable by
its audience.

12. The program has provided help in the interpretation of
results and assistance in their implementation.

13. The program has provided for the active involvement of
groups of persons from all of the major audiences for the assessment
results.

³⁴Womer, Frank B., Developing a Large Scale Assessment Program,
Cooperative Accountability Project, Denver, Colo., 1973.




DISCUSSION OF THE MODEL AS IT APPLIES
TO THE CITIZENSHIP ASSESSMENT

As it was developed for the Citizenship Assessment, the model
contained several problems. First, ten broad goals were identified
as the basis for the National Assessment of Citizenship. These
goals were:³⁵

1. Show concern for the welfare and dignity of others.

2. Support rights and freedoms of all individuals.

3. Help maintain law and order.

4. Know the main structure and functions of our government.

5. Seek community improvement through active democratic
participation.

6. Understand problems of international relations.

7. Support rationality in communication, thought, and actions
on social problems.

8. Take responsibility for own personal development and
obligations.

9. Help and respect their own families.

10. Nurture the development of their children as future citizens.

These goals were then developed in greater detail through a set of
behavioral objectives for each goal. A major question which needs
to be asked about the model is, "Do the ten goals cover all of the
dimensions of citizenship that we wish to include?" For example,
there is no mention in the ten goals of the willingness of the
individual to hold public office. While this is a value question, it
is possible that there are some serious omissions in this list of
ten goals on which the behavioral objectives for the assessment were
based.

³⁵Norris, Eleanor L. (Ed.), Citizenship Objectives, Committee on
Assessing the Progress of Education, Ann Arbor, Mich., 1969.

The behavioral objectives identified for Citizenship were of
necessity general in nature. Yet, in a population as large and
diverse as that of the United States, there are many subcultures.
Undoubtedly, some groups of Americans hold citizenship beliefs which
were not reflected in the citizenship objectives which were
identified. This could be true for Indians living within a tribal structure
on the reservations, and this could be equally true for other groups
for whom the statements of behaviors were not acceptable as being
descriptive of good citizenship.

Again, the exercises did not describe standards of behavior to
which all people in the country could subscribe. The exercises only
reflected behaviors which were widely held in the population. One of
the obvious difficulties here is that some cultural differences are
viewed as undesirable departures from the norm. In other words, if
white, Anglo-Saxon Protestants from the affluent suburbs responded
in a certain manner, then that was considered to be the desirable
response, and the responses of other respondents were compared to
the so-called desired response. The black population has been
especially concerned about this, and they have protested the manner
in which their responses have been reported for both the Southeastern
part of the country and the core area of the large metropolitan cities.
If this is truly a census of the knowledges, attitudes, skills, and
understandings of Americans, then the responses of one group should
not be negatively compared with the responses of other groups and
geographic areas in the country.

Other areas of concern with respect to the model are the
limitations of some of the technical procedures used. Based on the broad
goals identified for the Citizenship assessment, specific objectives
were written. The process of writing and screening these objectives
included certain difficulties. As stated in the model, this
activity was carried out at a center in California. The initial
identification of objectives reflected the knowledge and background of the
small group of specialists at the center. With this somewhat biased
origin, the objectives were then screened by several groups of people
who were selected to be more representative of the general population,
but a review of the names, addresses, and occupations of these panel
members raised questions as to how really representative they were.
In addition, National Assessment has very strong political overtones;
hence, everything eventually had to be screened for its political
implications. Objectives which would be offensive to one group or
another in the country were screened out so that the entire operation
had, from necessity, a certain blandness about it.

Still another technical difficulty with the model was the
transmission problem from one component to the next. One group of
specialists developed the objectives and another group wrote the exercises.
Hence, the accurate interpretation of the objectives through the
exercises was another difficulty. In addition, there was the
technical problem of actually writing exercises which truly reflected
the behavioral objectives. For some objectives, it was very difficult
to write an exercise which would demonstrate if the subject could
perform the desired citizenship behavior. Furthermore, the exercise
writer might not completely understand the intent of the objective;
hence, he failed to present it accurately in the exercises.

Yet another technical problem involved the actual scoring of the
exercises. A third group of specialists was hired to score the
exercises. This probably presented little difficulty in the case of
the multiple-choice exercises, but with the open-ended kind of
exercises, a decision had to be reached as to what was a good answer,
a not so good answer, and a poor answer. Then scorers were trained
to score the exercises. Naturally, reliability checks were run on
the scorers. While it was true that a high level of consistency was
achieved with respect to how they scored the exercises, the scoring
was still approached from the perspective of a particular group of
people who had been trained to take a particular orientation to
scoring the responses to the exercises. The perspectives of the
various sub-culture groups within our entire society probably were
not accurately included within the scorers' perspectives of how the
exercises should be scored. This kind of problem always exists in
scoring open-ended exercises, and in this situation where a very wide
variety of people were included in the sample, undoubtedly many
questions were scored from a frame of reference completely alien to
that of some of the original responders.
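
A reliability check of the kind mentioned above can be illustrated by comparing two scorers' ratings of the same open-ended responses. The sketch below computes simple percent agreement and Cohen's kappa, which discounts the agreement expected by chance. Which statistic National Assessment actually used is not stated here, so the choice of kappa, the three-level rating scale, and the ratings themselves are assumptions made for illustration.

```python
from collections import Counter


def percent_agreement(first, second):
    """Share of responses on which the two scorers gave the same rating."""
    matches = sum(a == b for a, b in zip(first, second))
    return matches / len(first)


def cohen_kappa(first, second):
    """Agreement corrected for chance (undefined if chance agreement is 1)."""
    n = len(first)
    observed = percent_agreement(first, second)
    counts_first, counts_second = Counter(first), Counter(second)
    categories = set(first) | set(second)
    # Chance agreement: product of each scorer's marginal proportions.
    expected = sum(counts_first[c] * counts_second[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of eight open-ended responses by two trained scorers.
scorer_1 = ["good", "good", "fair", "poor", "fair", "good", "poor", "fair"]
scorer_2 = ["good", "fair", "fair", "poor", "fair", "good", "poor", "poor"]

agreement = percent_agreement(scorer_1, scorer_2)
kappa = cohen_kappa(scorer_1, scorer_2)
print(agreement, round(kappa, 3))
```

A high value on either measure shows only that the scorers were consistent with one another. It does not show that their shared frame of reference matched that of the respondents, which is the concern raised in the paragraph above.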

In collecting the data on the out-of-school groups, the
administration of the interviews presented still another problem. Admittedly,
this was the best procedure for reaching the out-of-school population,
and it had the advantage of person-to-person contact. These are very
positive traits of the interview procedure; however, the procedure
has inherent problems when it comes to reaching members of
sub-cultural groups. Those persons who were selected for this work and
trained for it undoubtedly possessed characteristics which made them
acceptable to the enterprise. On the other hand, these very traits
made them unacceptable to certain of the sub-cultural groups with
whom they would have to communicate. It is highly probable that the
interviewers were not equally successful with all interviewees and
probably were unable to contact certain groups in our population
who were hostile to them. This probably had an impact on the
sampling results.

Finally, the concept of census-like, value-free reporting is
questioned. It is the writer's opinion that the reporting was
conducted in as objective a manner as possible, but that at times such
things as the organization of the reports, the wording of the reports,
and the emphasis in the reporting reflected value positions of the
persons at National Assessment who were doing the technical writing.
After all, these individuals are all well educated members of the
educational establishment. They will filter all of the information
reported through their own cultural perspectives in writing the
reports. Also, they are sensitive people who are trying to present
the information as clearly, honestly, and acceptably as possible.
They are going to state things in such a manner as not to be offensive
to any particular group; thus there is additional filtering of the
information which has implications for what is reported.

In summary, the human element enters into the model at a number
of points introducing subjective variables. These are in the
selection of goals, the writing of behavioral objectives, the writing
of the exercises, the administration of the open-ended interview
exercises, the scoring of the open-ended exercises, and the reporting
of the data. This is not to say that National Assessment has not
succeeded in carrying out as objective an assessment as they could,
but it is necessary to point out some of the places where subjective
kinds of variables could have been introduced into the model.




DISCUSSION OF THE MODEL FOR STATE AND LOCAL USES

The assessment model has potential for promoting curriculum
development. This is especially true when it is applied to state
or local situations in the manner used in Colorado and San
Bernardino. In these two situations, objectives were developed
which specifically applied to the local situation. The statement of
well written objectives in behavioral terms may sharpen the purposes
of instruction. Through the experience of writing behavioral
objectives, the curriculum worker gains a much clearer perception of
his task; hence, this practice may have a beneficial impact on
curriculum work. On the other hand, the use of behavioral objectives
has not always been a positive influence. The objectives may zero
in on easily defined behaviors which lack scope and significance.
They may produce tunnel vision, and put stress on the inconsequential
and trivial. In an effort to be specific and to define the exact
behaviors desired, the larger perspective may be lost.

Again, the development of exercises from the identified
behavioral objectives may have a positive influence on curriculum.
The kind of new, innovative exercises which have been developed by
National Assessment may have a very positive influence on what is
being taught and how it is being taught. Teachers both in reviewing
exercises which have been used in National Assessment and in writing
exercises for local assessments may be influenced in their selection
of both content and methods by their knowledge of these assessment
exercises. Material not relevant to the objectives of the course may
be dropped, and methodologies promoting the kind of skills needed in
the assessment exercises may be introduced.






On the other hand, the results may be less desirable. If in
local and state situations the dictates of finances or the lack of
leadership results in the use of poorly written, machine-scored,
multiple-choice exercises, the results may be very negative.
Teachers may feel pressured to stress rote learning of facts in order to
prepare their students for poorly written examinations. Hence,
poorly written exercises may keep irrelevant material in the
curriculum and limit curriculum innovation and development. The quality
of the exercises written and released will have an impact on
curriculum development.

Good sampling procedures may give insight into the status of
knowledge, understandings, skills, and attitudes of students in a
particular target population. This can promote curriculum
improvement and innovation. Problem areas in the curriculum may be
identified. From the National Assessment, there have been some problem
areas identified in the Citizenship results. On an exercise dealing
with freedom of speech, a large percentage of 13, 17, and adult age
groups indicated that they would not allow sample controversial
statements to be made on radio or TV. This showed a lack of
understanding or valuing of the Constitutional right of freedom to express
controversial or unpopular opinions.³⁶

The results on the Citizenship assessment indicated that black,
urban students in our large cities compared poorly on knowledge about
the structure and function of government to the national average
performance on the same exercises.³⁷

³⁶Campbell, Vincent N., et al., Report 2, Citizenship: National
Results, Education Commission of the States, National Assessment
of Educational Progress, Denver, Colo., November, 1970,
p. [illegible].

³⁷Norris, Eleanor L., Vincent N. Campbell, Manford J. Ferris, and
Carmen J. Finley, National Assessment Report 9, 1969-1970
Citizenship: Group Results for Parental Education, Color,
Size, and Type of Community, Education Commission of the
States, National Assessment of Educational Progress, Denver,
Colo., May, 1972, pp. 63-65.

On the other hand, there are potential difficulties with
assessment data which represent national levels of performance.
Even though the data were not collected with this intention and
were reported in census-like form, the results of National
Assessment are being treated like national norms. Several states have
conducted their own assessments duplicating the National Assessment
model so that they can make direct comparisons between their state
results and the various national, regional, and subgroup results.
There is the potential of great mischief in this approach, for it
may lead to inaccurate comparisons between groups, states, and
regions. In the assessment reports of some states, tables of
percentages have been presented without any interpretation or
explanation. Some school systems have been presented in a very bad way
without any reference being made to the kinds of variables involved
in the different learning situations. Such variables as per pupil
expenditures, educational level of parents, and motivation of
pupils do have an impact on the learning situation. These and
other variables should not be ignored in interpreting the results
of assessment.

Here, it is not being suggested that assessments should not
be conducted because there are potential misuses of the data, but
those engaged in assessment at national, state, and local levels
have the responsibility to be constantly engaged in an educational
program to report data in the proper perspective and to aid those
using the data to make correct interpretations of them. We need these
data for decision making, but if they are misused or misinterpreted,
then the decisions based on them may not be good ones.






Finally, where accountability is being applied to a total
organization such as a school, a district, or a state, the National
Assessment model may be used with little or no modification. It
was designed to accurately establish what the level of performance
on a given set of objectives was in a population, and it can be
used to do this for accountability purposes as well as assessment
purposes. Likewise, it can assess subgroups of the population and
identify specific strengths or weaknesses in the performance of a
given subgroup. The model is an excellent instrument for
carrying out accountability in this kind of situation.






